CLAIMS 



The invention claimed is: 

1. A method for providing reliable communication in an interconnected network of data 
processing nodes, said method comprising: 

detecting a failure of nodes or communication links in a system using a heartbeat 
mechanism to indicate to said nodes that at least one of said nodes or said communication links 
are functioning or have failed; 

establishing an instance identifier associated with said failure; 

sending notification of said failure, including said instance identifier, to other nodes 
having existing communication links with said at least one failed node; and 

terminating, at said notified nodes, pending communication links that involve said at least 
one failed node, said termination being carried out in response to said notification. 
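
The steps recited in claim 1 can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the claimed implementation: the names Node, HeartbeatMonitor, on_failure_notice and check are hypothetical, failure is inferred from a missed-heartbeat timeout, and the instance identifier is modeled as a per-node integer counter.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    # Pending links, stored as the set of peer names (hypothetical representation).
    links: set = field(default_factory=set)
    # Failure notices received, as (failed_node, instance_id) pairs.
    notices: list = field(default_factory=list)

    def on_failure_notice(self, failed: str, instance_id: int) -> None:
        """Terminate pending links to the failed node in response to a notice."""
        self.notices.append((failed, instance_id))
        self.links.discard(failed)


class HeartbeatMonitor:
    """Declares a node failed when its heartbeat is older than `timeout`."""

    def __init__(self, nodes, timeout: float):
        self.nodes = {n.name: n for n in nodes}
        self.timeout = timeout  # assumed detection threshold, in seconds
        self.last_seen = {}     # node name -> time of last heartbeat
        # Assumed model: one integer instance identifier per node.
        self.instance_id = {n.name: 0 for n in nodes}
        self.failed = set()

    def heartbeat(self, name: str, now: float) -> None:
        """Record a heartbeat received from `name` at time `now`."""
        self.last_seen[name] = now

    def check(self, now: float) -> list:
        """Detect failures, notify linked peers, and return newly failed names."""
        newly_failed = []
        for name, seen in self.last_seen.items():
            if name in self.failed or now - seen <= self.timeout:
                continue
            self.failed.add(name)
            instance = self.instance_id[name]  # identifier tied to this failure
            # Notify only nodes that have an existing link with the failed node.
            for peer in self.nodes.values():
                if peer.name != name and name in peer.links:
                    peer.on_failure_notice(name, instance)
            newly_failed.append(name)
        return newly_failed
```

In this sketch, a node that stops calling heartbeat is reported by the next check call after the timeout elapses, and every peer holding a link to it drops that link when the notice arrives. Peers without such a link receive no notice.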

2. The method of claim 1 further including the step of detecting that said at least one failed 
node is no longer in a failed state and resuming communications with that node using an 
incremented value for said instance identifier. 

3. The method of claim 2 further including the step of resuming communications with said 
other nodes using said incremented instance identifier. 
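
The resumption described in claims 2 and 3 can be sketched as follows. The class LinkTable and its methods are hypothetical, and the rule that a message stamped with an older identifier is rejected is an assumption added here to show one possible use of the incremented value; the claims themselves recite only that communication resumes using the incremented instance identifier.

```python
class LinkTable:
    """One node's view of its peers and their instance identifiers."""

    def __init__(self):
        self.instance = {}  # peer name -> current instance identifier
        self.up = {}        # peer name -> True if believed functioning

    def add_peer(self, peer: str, instance_id: int = 0) -> None:
        self.instance[peer] = instance_id
        self.up[peer] = True

    def mark_failed(self, peer: str) -> None:
        """Record that a failure notification was received for `peer`."""
        self.up[peer] = False

    def mark_recovered(self, peer: str) -> int:
        """Resume communication under an incremented instance identifier."""
        self.instance[peer] += 1
        self.up[peer] = True
        return self.instance[peer]

    def accept(self, peer: str, instance_id: int) -> bool:
        """Assumed policy: accept only messages carrying the current identifier."""
        return self.up.get(peer, False) and instance_id == self.instance.get(peer)
```

Under these assumptions, traffic left over from before the failure carries the old identifier and is distinguishable from traffic sent after the node returns to service.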

4. A data processing system comprising: 

a plurality of interconnected data processing nodes; 

heartbeat signal generators within each said node for providing a signal to others of said 
nodes indicative of node failure status; 

heartbeat signal detectors within said nodes for indicating that a certain node has failed; 

a first program within said nodes for establishing an instance identifier associated with 
each node failure and for transmitting notification of said failure and said instance identifier to 
nonfailed nodes; and 

a second program within said nodes for terminating, at said notified nodes, pending 
communication links that involve said at least one failed node, said termination being carried out 
in response to said notification. 

5. The data processing system of claim 4 in which said heartbeat signal detectors also 
provide an indication that a failed node has returned to functioning status. 

6. The data processing system of claim 5 further comprising a third program within said 
nodes which resumes communication with nodes that have returned to functioning status, said 
communication including transmission of a new instance identifier. 

7. A computer program product comprising a computer readable medium on which is stored 
program means for: 

detecting a failure of nodes or communication links in a system using a heartbeat 
mechanism to indicate to said nodes that at least one of said nodes or said communication links 
are functioning or have failed; 

establishing an instance identifier associated with said failure; 

sending notification of said failure, including said instance identifier, to other nodes 
having existing communication links with said at least one failed node; and 

terminating, at said notified nodes, pending communication links that involve said at least 
one failed node, said termination being carried out in response to said notification. 